AITopics | synchronous training

972cd27c994a806e187ef1c2f5254059-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-16-2026, 00:45:11 GMT

dropcompute, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

DropCompute: simple and more robust distributed synchronous training via compute variance reduction

Neural Information Processing Systems | Feb-16-2026, 00:45:07 GMT

Thus, these methods are limited by the delays caused by straggling workers.

dropcompute, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

be0a8ecf8b2743a4117557c5eca0fb79-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 16:32:33 GMT

gradient, hop-bw hop-bs bsp, synchronous training, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Recommendation Models

Neural Information Processing Systems | Feb-11-2026, 16:32:29 GMT

Although synchronous AR training is designed to have higher training efficiency, asynchronous PS training would be a better choice for training speed when there are stragglers (slow workers) in the shared cluster, especially under limited computing resources.

artificial intelligence, machine learning, staleness, (17 more...)

Neural Information Processing Systems

Country:

Europe > Czechia > Prague (0.05)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

DropCompute: simple and more robust distributed synchronous training via compute variance reduction

Neural Information Processing Systems | Dec-26-2025, 09:29:57 GMT

Background: Distributed training is essential for large-scale training of deep neural networks (DNNs). The dominant methods for large-scale DNN training are synchronous (e.g., All-Reduce), but these require waiting for all workers in each step. Thus, these methods are limited by the delays caused by straggling workers.

Results: We study a typical scenario in which workers are straggling due to variability in compute time. We find an analytical relation between compute time properties and scalability limitations, caused by such straggling workers. With these findings, we propose a simple yet effective decentralized method to reduce the variation among workers and thus improve the robustness of synchronous training. This method can be integrated with the widely used All-Reduce. Our findings are validated on large-scale training tasks using 200 Gaudi Accelerators.
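The straggler effect described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method or code: the compute-time model (one time unit plus exponential noise per micro-batch), the function names, and the optional `cap` that makes a worker stop computing early are all hypothetical choices made for illustration. The one property taken from the abstract is that a synchronous step ends only when the slowest worker has finished.

```python
import random
import statistics


def step_time(num_workers, micro_batches, rng, cap=None):
    """Simulate one synchronous step.

    Returns (wall-clock time of the step, fraction of micro-batches completed).
    """
    worker_times = []
    done = 0
    for _ in range(num_workers):
        elapsed = 0.0
        for _ in range(micro_batches):
            # Hypothetical compute-time model: 1.0 time unit plus exponential noise.
            cost = 1.0 + rng.expovariate(4.0)
            if cap is not None and elapsed + cost > cap:
                # Assumed mitigation: skip the remaining micro-batches
                # instead of delaying every other worker.
                break
            elapsed += cost
            done += 1
        worker_times.append(elapsed)
    # Synchronous (All-Reduce style) step: everyone waits for the slowest worker.
    return max(worker_times), done / (num_workers * micro_batches)


def mean_step_time(num_workers, micro_batches=8, trials=200, cap=None, seed=0):
    """Average step time and completed-work fraction over several simulated steps."""
    rng = random.Random(seed)
    results = [step_time(num_workers, micro_batches, rng, cap) for _ in range(trials)]
    mean_time = statistics.mean(t for t, _ in results)
    mean_fraction = statistics.mean(f for _, f in results)
    return mean_time, mean_fraction
```

Comparing `mean_step_time(1)` with `mean_step_time(64)` shows the average step getting longer as workers are added, even though each worker's own compute-time distribution is unchanged. Passing a `cap` bounds the step time at the cost of dropping a small share of the micro-batches.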

compute variance reduction, dropcompute, synchronous training, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.61)

972cd27c994a806e187ef1c2f5254059-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 02:01:34 GMT

dropcompute, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

DropCompute: simple and more robust distributed synchronous training via compute variance reduction

Neural Information Processing Systems | Oct-9-2025, 02:01:30 GMT

Thus, these methods are limited by the delays caused by straggling workers.

dropcompute, large language model, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

be0a8ecf8b2743a4117557c5eca0fb79-Supplemental-Conference.pdf

Neural Information Processing Systems | Aug-18-2025, 11:00:03 GMT

artificial intelligence, gradient, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Models

Neural Information Processing Systems | Aug-18-2025, 10:59:59 GMT

Although synchronous AR training is designed to have higher training efficiency, asynchronous PS training would be a better choice for training speed when there are stragglers (slow workers) in the shared cluster, especially under limited computing resources.

artificial intelligence, machine learning, training mode, (17 more...)

Neural Information Processing Systems

Country:

Europe > Czechia > Prague (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation

Huang, Wuwei, Jin, Renren, Zhang, Wen, Luan, Jian, Wang, Bin, Xiong, Deyi

arXiv.org Artificial Intelligence | Mar-14-2025

Recent studies on end-to-end speech translation (ST) have facilitated the exploration of multilingual end-to-end ST and end-to-end simultaneous ST. In this paper, we investigate end-to-end simultaneous speech translation in a one-to-many multilingual setting, which is closer to real-world applications. We explore a separate decoder architecture and a unified architecture for joint synchronous training in this scenario. To further explore knowledge transfer across languages, we propose an asynchronous training strategy on the proposed unified decoder architecture. A multi-way aligned multilingual end-to-end ST dataset was curated as a benchmark testbed to evaluate our methods. Experimental results demonstrate the effectiveness of our models on the collected dataset. Our code and data are available at: https://github.com/XiaoMi/TED-MMST.

speech translation, target language, translation, (13 more...)

arXiv.org Artificial Intelligence

2503.1108

Country: